ABSTRACT

Using a combination of N-body simulations, semi-analytic models and radiative trans- 
fer calculations, we have estimated the theoretical cross power spectrum between 
galaxies and the 21 cm emission from neutral hydrogen during the epoch of reionization.
In accordance with previous studies, we find that the 21 cm emission is initially
correlated with halos on large scales (~ 30 Mpc), anti-correlated on intermediate
(~ 5 Mpc), and uncorrelated on small (~ 3 Mpc) scales. This picture quickly changes
as reionization proceeds and the two fields become anti-correlated on large scales. The 
normalization of the cross power spectrum can be used to set constraints on the aver- 
age neutral fraction in the intergalactic medium and its shape can be a tool to study 
the topology of reionization. When we apply a drop-out technique to select galaxies 
and add to the 21 cm signal the noise expected from the LOFAR telescope, we find 
that while the normalization of the cross power spectrum remains a useful tool for 
probing reionization, its shape becomes too noisy to be informative. On the other 
hand, for a Lyα Emitter (LAE) survey both the normalization and the shape of the
cross power spectrum are suitable probes of reionization. A closer look at a specific 
planned LAE observing program using Subaru Hyper-Suprime Cam reveals concerns 
about the strength of the 21 cm signal at the planned redshifts. If the ionized fraction 
at z ~ 7 is lower than the one estimated here, then using the cross power spectrum
may be a useful exercise given that at higher redshifts and neutral fractions it is able 
to distinguish between two toy models with different topologies. 

Key words: cosmology: observations — reionization — galaxies: formation — inter- 
galactic medium 



1 INTRODUCTION 

The epoch of reionization (EoR) is considered one of the 
great observational frontiers in astronomy today. It forms
the crucial bridge between the epoch of recombination and
the galaxies we currently observe. Since it was the era of 
first substantial galaxy formation, it provides the context 
in which to understand the local universe. The reionization 
process itself likely had a direct impact on further galaxy 
formation and growth, primarily due to the dramatic change
in temperature of the Intergalactic Medium (IGM) caused
by the photo-ionization. 

The investigation of the EoR takes place on both theoretical
and observational fronts. From the theoretical perspective,
analytic models of reionization have been employed since
Arons & Wingert (1972) and Hogan & Rees (1979), and improved
versions are still being developed (e.g. Furlanetto & Loeb 2005;
Choudhury & Ferrara 2006; Bolton & Haehnelt 2007). While these
models display a high
degree of sophistication, the non-linear nature of the feed- 
back of reionization upon further galaxy formation is not 
typically captured. Approaches based on (semi-) numerical 
methods can incorporate such and other complexities, as for 
example the three-dimensional effects of shadowing and the 
overlap of ionized regions. This allows them to treat more 
of the physics properly, although it comes of course at the 
cost of computing time. 

All these theoretical investigations have considerably
advanced our understanding of the progress of reionization,
in particular the effect of inhomogeneities in the radiation
field, the relative importance of minihalos, quasars, and
regular galaxies, and the topology of reionization (e.g.
Ciardi et al. 2000; Gnedin 2000; Razoumov et al. 2002;
Ciardi et al. 2003; Sokasian et al. 2003; Iliev et al. 2007;
Kohler et al. 2007; Zahn et al. 2007; Thomas & Zaroubi 2011;
Ciardi et al. 2011).



However, numerical simulations also have their limita- 
tions. To make such calculations feasible for a representative 
sample of the Universe, several assumptions/simplifications 
must be made. These simplifications are a result of both 
the uncertainty in the physics (e.g., star formation effi- 
ciency, properties of the ionizing sources, escape fraction) 
and the finite computing resources which entail a certain 
finite resolution for the simulation. Hence, to span the
parameter space of interest, methods typically resort to
various Monte Carlo and post-processing techniques that do not
capture feedback effects self-consistently (e.g. Ciardi et al.
2000; Thomas & Zaroubi 2011).

On the observational front, probing the EoR directly
has been difficult and to date our main constraints on
the time interval during which reionization occurred are
the Thomson scattering optical depth at high redshift
(Komatsu et al. 2011) and the absence of Gunn-Peterson
troughs at the lower redshift end (e.g. Fan et al. 2006;
Becker et al. 2007, but see also Schroeder et al. 2012 for the
detection of Gunn-Peterson damping wings).

In pursuit of a direct detection of the EoR, a number of 
instruments in various phases of development will be used to 
attempt to detect neutral hydrogen via its 21 cm hyperfine 
transition. PAPER, LOFAR, 21CMA, GMRT, MWA,
and eventually SKA all hope to detect the 21 cm signal
from the EoR. It is not only the detection of the trend in the
decline (from higher to lower redshift) of the global neutral
hydrogen content that will be informative. Also of interest
will be the spatial distribution/fluctuations of the 21 cm
signal at a given redshift.

In these early attempts, detecting the 21 cm signal 
from the EoR will be extremely challenging. To allevi- 
ate some of the problems, several cross-correlation anal- 
yses with observations in other wavelength regimes have 
been proposed. Under the assumption that the noise and 
uncertainties will be mitigated by using two observations 
of such different character, we can hope to put con- 
straints on the nature of reionization and hence gain fur- 
ther insights into the processes active during the EoR. 
In recent years several authors have undertaken theo- 
retical studies of the cross-correlation analysis of 21 cm 
measurements with other observations. Correlations with
the cosmic microwave background (CMB; Salvaterra et al.
2005; Adshead & Furlanetto 2008; Berndsen et al. 2010;
Jelić et al. 2010), galaxy surveys (Lidz et al. 2009) and
CO-emission surveys (Lidz et al. 2011) have already been
proposed. This type of analysis is particularly timely also in
view of the exciting progress made in the observation of
high-redshift galaxies (e.g. Ouchi et al. 2010; Bouwens et al.
2012 and references therein), which promises to provide
a large, statistically significant sample of such objects in the
near future.

In this paper, we make predictions for the observation 
of the cross-correlation between the 21 cm and galaxy fields
along the lines of Lidz et al. (2009), but tailored towards
the LOFAR-EoR experiment and the future high-redshift
Subaru galaxy surveys. Using a dark matter simulation
and an efficient radiative transfer code, we begin by cross- 
correlating the distribution of dark matter halos with the 
distribution of the 21 cm signal. We continue by using a 
well-studied semi-analytic model for galaxy formation and 
evolution to populate the halos with galaxies, thereby in- 
corporating realistic detection and identification limits for 
the galaxies. We also add the expected noise characteristics 
from LOFAR to the 21 cm signal to determine the use of 
galaxy-21cm cross-correlation for detecting and characteriz- 
ing reionization. 

This paper is organized as follows: Section 2 specifies the dark
matter simulation, radiative transfer code and the method
used to construct the cross power spectrum. In Section 3 we
calculate the cross power spectrum without imposing any
observational limitations, in order to find the theoretically
predicted best possible scenario for the detection. In Section 4
we see how introducing more realistic specifications for both
the 21 cm and the galaxy survey modifies the theoretical result.
Finally, in Section 5 we discuss these results and the viability
of performing such a cross-correlation in practice.



2 METHOD 

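Following Lidz et al. (2009), we define the 3D cross power
spectrum between the 21 cm emission and the galaxies as:

$$\Delta^2_{21,\mathrm{gal}}(k) = \frac{\tilde{\Delta}^2_{21,\mathrm{gal}}(k)}{\delta T_{b0}} = \langle x_{\mathrm{HI}} \rangle \left[ \Delta^2_{x,\mathrm{gal}}(k) + \Delta^2_{\rho,\mathrm{gal}}(k) + \Delta^2_{x\rho,\mathrm{gal}}(k) \right] \tag{1}$$
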
The cross power spectrum is thus the sum of three other cross
power spectra: the neutral fraction-galaxy cross power spectrum,
$\Delta^2_{x,\mathrm{gal}}$, the density-galaxy cross power
spectrum, $\Delta^2_{\rho,\mathrm{gal}}$, and the neutral
density-galaxy cross power spectrum, $\Delta^2_{x\rho,\mathrm{gal}}$.
In Equation (1), $\tilde{\Delta}^2_{21,\mathrm{gal}}(k)$ is the
unnormalized cross power spectrum, while
$\Delta^2_{21,\mathrm{gal}}(k)$ is defined such that the 21 cm
brightness temperature relative to the CMB for neutral gas at the
mean density of the universe, $\delta T_{b0}$, is scaled out
(normalized cross power spectrum), since the 21 cm field can be
given by $\delta_{21}(\mathbf{r}) = \delta T_{b0} \langle x_{\mathrm{HI}} \rangle (1 + \delta_x(\mathbf{r}))(1 + \delta_\rho(\mathbf{r}))$,
where $\langle x_{\mathrm{HI}} \rangle$ is the mean volume averaged
neutral fraction and $\delta_i(\mathbf{r})$ represents the spatial
field $i$ with respect to its mean, e.g.
$\delta_i(\mathbf{r}) = (i(\mathbf{r}) - \langle i \rangle)/\langle i \rangle$.
For the neutral hydrogen field, $i$ refers to $x_{\mathrm{HI}}(\mathbf{r})$,
the fraction of hydrogen that is neutral at position $\mathbf{r}$,
while for the galaxy field, $i$ refers to $n_{\mathrm{gal}}(\mathbf{r})$,
the number density of galaxies at $\mathbf{r}$. Finally, we work with
the dimensionless cross power spectrum, i.e.
$\Delta^2_{a,b}(k) = k^3 P_{a,b}(k)/(2\pi^2)$ for the 3D power spectrum
and $\Delta^2_{a,b}(k) = 2k^2 P_{a,b}(k)$ for the 2D power spectrum,
where $P_{a,b}$ is the dimensional cross power spectrum between
fields $a$ and $b$. We refer the reader to Lidz et al. (2009) for
a more detailed discussion of the three terms in Equation (1).
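
As a rough illustration of the definitions above, the following Python sketch estimates the spherically averaged, dimensionless 3D cross power spectrum of two gridded fluctuation fields with NumPy. The function names, the logarithmic binning and the Fourier normalization (P = Re[a(k) b*(k)]/V) are assumptions made for this example only; they are not taken from the pipeline used in this work.

```python
import numpy as np


def overdensity(field):
    """delta_i(r) = (i(r) - <i>) / <i> for a gridded field i."""
    mean = field.mean()
    return (field - mean) / mean


def cross_power_3d(delta_a, delta_b, box_size, n_bins=12):
    """Return (k, Delta^2_{a,b}(k)) with Delta^2 = k^3 P_{a,b}(k) / (2 pi^2).

    delta_a, delta_b : real cubic arrays of identical shape (fluctuation fields)
    box_size         : comoving side length; k is returned in inverse units of it
    """
    n = delta_a.shape[0]
    cell_volume = (box_size / n) ** 3
    # Discrete approximation to the continuous Fourier transform.
    fa = np.fft.fftn(delta_a) * cell_volume
    fb = np.fft.fftn(delta_b) * cell_volume
    # Assumed convention: P_{a,b}(k) = Re[a(k) b*(k)] / V.
    power = ((fa * np.conj(fb)).real / box_size ** 3).ravel()

    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kmag = np.sqrt(kx ** 2 + ky ** 2 + kz ** 2).ravel()

    # Logarithmic shells from just below the fundamental mode to Nyquist.
    k_fund = 2.0 * np.pi / box_size
    k_nyq = np.pi * n / box_size
    edges = np.logspace(np.log10(0.999 * k_fund), np.log10(k_nyq), n_bins + 1)
    shell = np.digitize(kmag, edges) - 1
    keep = (shell >= 0) & (shell < n_bins)  # drops k = 0 and modes beyond Nyquist

    counts = np.bincount(shell[keep], minlength=n_bins)
    k_sum = np.bincount(shell[keep], weights=kmag[keep], minlength=n_bins)
    p_sum = np.bincount(shell[keep], weights=power[keep], minlength=n_bins)
    filled = counts > 0
    k_mean = k_sum[filled] / counts[filled]
    p_mean = p_sum[filled] / counts[filled]
    return k_mean, k_mean ** 3 * p_mean / (2.0 * np.pi ** 2)
```

With hypothetical arrays `x_hi` and `n_gal` on a common grid, `cross_power_3d(overdensity(x_hi), overdensity(n_gal), box_size)` would give an estimate of the neutral fraction-galaxy term; the other two terms follow in the same way from the corresponding fields.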

In order to construct the cross power spectrum, we 
therefore require three fields, the density field, the neutral 
hydrogen field, and the galaxy field. 

For this work we make use of the well-studied Millennium
Simulation (Springel et al. 2005). It is a dark matter
simulation featuring $2160^3$ particles in a $500\,h^{-1}$ Mpc
comoving box run from z = 127 down to z = 0. It was
run in a $\Lambda$CDM cosmology with
$(\Omega_m, \Omega_\Lambda, \Omega_b h^2, h, \sigma_8, n) = (0.25, 0.75, 0.024, 0.73, 0.9, 1.0)$,
which implies a particle mass of $1.2 \times 10^9\,M_\odot h^{-1}$.
We have scaled the cosmology to the more recent WMAP7
measurements found in (an early version of) Komatsu et al.
(2011; see footnote 8),
$(\Omega_m, \Omega_\Lambda, \Omega_b h^2, h, \sigma_8, n) = (0.272, 0.728, 0.02246, 0.702, 0.807, 0.961)$,
in accordance with the method described in Angulo & White (2010),
scaling the output redshift, distance coordinates and particle
masses. All quantities are transferred to a $256^3$ grid, using
a cloud-in-cell scheme for the density and galaxy fields.
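
For readers unfamiliar with cloud-in-cell (CIC) assignment, a minimal NumPy sketch is given here. It assumes a periodic cubic box and particle coordinates in [0, box_size); the function name and interface are hypothetical and do not reproduce the code actually used to grid the simulation.

```python
import numpy as np


def cic_deposit(positions, n_grid, box_size, weights=None):
    """Deposit particles on a periodic n_grid^3 mesh with cloud-in-cell weights.

    positions : (N, 3) array of coordinates in [0, box_size)
    weights   : optional (N,) array, e.g. particle masses (defaults to 1)
    """
    positions = np.asarray(positions, dtype=float)
    if weights is None:
        weights = np.ones(len(positions))
    grid = np.zeros((n_grid, n_grid, n_grid))
    # Coordinates in cell units, shifted so that cell centres sit at integers.
    x = positions / box_size * n_grid - 0.5
    lower = np.floor(x).astype(int)
    frac = x - lower  # fractional distance to the lower neighbouring cell centre
    for dx in (0, 1):
        wx = frac[:, 0] if dx else 1.0 - frac[:, 0]
        for dy in (0, 1):
            wy = frac[:, 1] if dy else 1.0 - frac[:, 1]
            for dz in (0, 1):
                wz = frac[:, 2] if dz else 1.0 - frac[:, 2]
                idx = (
                    (lower[:, 0] + dx) % n_grid,
                    (lower[:, 1] + dy) % n_grid,
                    (lower[:, 2] + dz) % n_grid,
                )
                # np.add.at accumulates correctly when particles share a cell.
                np.add.at(grid, idx, weights * wx * wy * wz)
    return grid
```

Each particle is shared between its eight nearest cells with trilinear weights, so the total deposited weight equals the sum of the input weights.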

Halos with masses greater than $10^{10}\,M_\odot$ (corresponding
to a limit of about 20 particles) are selected as sources. Not
only does such a cut reduce resolution effects, but Lidz et al.
(2009) also used a similar limit for the minimum mass of halos
containing galaxies detectable by their hypothetical survey.
The spectrum assumed is that of a young, metal-poor stellar
population whose spectral energy distribution (SED) was
determined using Starburst99 (Leitherer et al. 1999) with an
absolute metallicity of 0.0004 and a Kroupa (Kroupa 2001)
initial mass function. The SED is scaled according to the mass
of the halo. The escape fraction of ionizing photons from each
halo is taken to be 10%. The density and source field from the
simulation, together with the SED, are given as input to the
BEARS code (Thomas et al. 2009) to calculate the neutral
hydrogen field. BEARS is a radiative transfer code that, given
the luminosity of a source and its spectrum, calculates a
spherically averaged density profile around the source and
embeds a spherically symmetric ionization bubble. These bubbles
are drawn from a catalogue of 1D radiative transfer results for
various types of spectra, luminosities, redshifts and density
profiles. Their size is limited according to the lifetime of
the sources. The code deals with overlapping HII regions by
increasing the sizes of the bubbles involved in the overlap in
such a way that the volume matches that of the overlap
regions, hence conserving photons. The reionization histories
calculated using BEARS give a Thomson scattering optical depth
of ~ 0.09, which is within the 1-$\sigma$ error bar of the
WMAP3 estimate. For more details of the 1D radiative transfer
code, the implementation of BEARS and its extensions, see
Thomas & Zaroubi (2008), Thomas et al. (2009) and Thomas &
Zaroubi (2011).

8 The published version of the paper used an updated version of
the RECFAST code and thus arrived at slightly different parameters.

It has to be noted that the resolution of the simulations
does not allow us to resolve the population of galaxies that
are thought to be responsible for the production of the
majority of the ionizing photons during reionization, which
reside in halos that cool via atomic and molecular transitions,
roughly in the range of $10^6 - 10^9\,M_\odot$ (e.g. Muñoz & Loeb
2011; Raičević et al. 2011).

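The output of the BEARS code is the neutral fraction
throughout our simulation volume at different redshifts. To
calculate the 21 cm-galaxy cross power spectrum we need
to calculate the 21 cm differential brightness temperature,
which is defined as follows (e.g. Thomas et al. 2009):

$$\delta T_b(\mathbf{r}) = 19\,\mathrm{mK}\,(1 + \delta(\mathbf{r})) \left( \frac{x_{\mathrm{HI}}(\mathbf{r})}{h} \right) \left( 1 - \frac{T_{\mathrm{CMB}}}{T_s(\mathbf{r})} \right) \left[ \frac{H(z)/(1+z)}{dv_\parallel/dr_\parallel} \right] \left[ \left( \frac{1+z}{10} \right) \left( \frac{0.272}{\Omega_m} \right) \right]^{1/2} \left( \frac{\Omega_b h^2}{0.02246} \right) \tag{2}$$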



where $\delta(\mathbf{r})$ is the matter overdensity at position
$\mathbf{r}$, $T_{\mathrm{CMB}}$ is the CMB temperature,
$T_s(\mathbf{r})$ is the spin temperature, and the other symbols
have their usual meanings. We make the approximations that
$T_s(\mathbf{r}) \gg T_{\mathrm{CMB}}$ everywhere and that the
peculiar velocities do not contribute, such that the fourth
and fifth terms are unity (see Mao et al. 2011 for a discussion
of the contribution of peculiar velocities to the 21 cm
power spectrum).
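
Under these two approximations the brightness temperature depends only on the local overdensity, the local neutral fraction and the redshift. The short Python sketch below evaluates it on a grid. It assumes the usual scaling of the prefactor (square root in (1+z)/10 and in 0.272/Ω_m, linear in Ω_b h²) and uses the scaled cosmology quoted in the text as defaults; the function name is hypothetical.

```python
import numpy as np


def delta_tb_mk(delta, x_hi, z, h=0.702, omega_m=0.272, omega_b_h2=0.02246):
    """21 cm differential brightness temperature in mK.

    Assumes T_s >> T_CMB and no peculiar velocities, so the spin-temperature
    and velocity-gradient factors are set to one.
    """
    delta = np.asarray(delta, dtype=float)
    x_hi = np.asarray(x_hi, dtype=float)
    # Assumed scaling: square root in (1+z) and Omega_m, linear in Omega_b h^2.
    cosmology = np.sqrt((1.0 + z) / 10.0 * 0.272 / omega_m) * (omega_b_h2 / 0.02246)
    return 19.0 * (1.0 + delta) * (x_hi / h) * cosmology
```

For fully neutral gas at the mean density and z = 9, all cosmology-dependent factors are unity for the default parameters and the function returns 19/h mK.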

In Fig. 1 we show the spherically averaged 3D 21 cm
power spectrum for the redshifts we will concentrate on
throughout this work. The normalization of the power spectrum
decreases as the age of the universe (and mean ionized
fraction) increases, the large-scale power being the last
to decrease. Initially there is much power on small scales
due to the clustered density profile. At later times, however,
the small-scale power diminishes while the large-scale power
remains.



3 THE THEORETICAL 21 CM - HALO CROSS
POWER SPECTRUM



Before concerning ourselves with what will be detected by 
upcoming observations, it is useful to understand the intrin- 
sic behaviour of the 21 cm - galaxy cross power spectrum. 
To study this, we use the dark matter halos that we de- 
scribed in the previous section to represent the galaxy field. 
Here we make no attempt to discern what will be detectable 
or identified as a galaxy in the specified redshift range.

Fig. 2 shows the spherically averaged 3D 21 cm - halo
cross power spectrum for z = 8.32, when the mean fraction
of neutral hydrogen is $\langle x_{\mathrm{HI}} \rangle = 0.53$.
Here we have broken it down into the three components listed in
Equation (1), where the black solid curve is the final result.
It is worth remarking that, while $\Delta^2_{\rho,\mathrm{gal}}(k)$
is always positive, $\Delta^2_{x\rho,\mathrm{gal}}(k)$ is
always negative and $\Delta^2_{x,\mathrm{gal}}(k)$ is negative
for k where there are no oscillations. This implies that an
anti-correlation is measured for the last two fields.

Taking each term in turn, the density-galaxy cross
power spectrum, $\Delta^2_{\rho,\mathrm{gal}}(k)$, is the most
straightforward component to interpret. On small scales, these
two fields correlate very strongly, reminding us that halo
formation is biased to high density regions. The power
decreases, however, towards large scales at which halos are
much less aware of the density. The increase of power is roughly
two orders of magnitude over two orders of magnitude in scale.

The reionization process in our simulation proceeds in 
an 'inside-out' fashion, that is high density regions are typ- 
ically ionized earlier than low density regions. At the red- 
shift we are considering here large-scale underdense regions 
are still mostly neutral and free of galaxies, whereas the 
overdense regions surrounding the galaxies have all become 
ionized. This leads to an anti-correlation between the galaxy 
and neutral fraction fields, the strength of which increases 
with decreasing scale, as illustrated by the behaviour of the 
$\Delta^2_{x,\mathrm{gal}}(k)$ term in Fig. 2. Depending on the details of the
model and the redshift, a turnaround manifests with a positive
correlation on scales at which the galaxies correlate
with the density field. On sub-bubble scales the correlation
is expected to die off because the interior of an HII region is
ionized independently of the local galaxy field. The typical
size of the ionized bubbles is then imprinted on the cross
power spectrum at the smallest scales as an oscillatory
behaviour. This behaviour is less pronounced in Lidz et al.
(2009) because, although the minimum resolved halo mass in their
reference simulation is similar to ours, they also incorporate
smaller halos, down to the atomic cooling mass, through a
semi-analytic prescription when following the reionization
history. As a result their reionization topology is less
dominated by large bubbles. Since bubbles are a generic feature
of reionization, we do not expect the oscillations to disappear
altogether in the high-resolution limit; rather, they would tend
towards the noise-like shape found in Lidz et al. (2009).

Finally, the $\Delta^2_{x\rho,\mathrm{gal}}(k)$ term, which is not very
significant on large scales, serves to cancel the galaxy - density
cross power spectrum almost entirely at small scales (recall
that they have different signs). This effect is particularly
strong for k-modes for which the size of the ionization bubble
is imprinted.

All three terms added give the measurable term, the
21 cm - galaxy cross power spectrum. Its shape is mainly
determined by the $\Delta^2_{\rho,\mathrm{gal}}(k)$ and
$\Delta^2_{x,\mathrm{gal}}(k)$ terms on large
scales, and almost completely by the latter at small scales.
This confirms the findings of Lidz et al. (2009) and we refer
the reader to their work for further discussion.

Figure 1. The spherically averaged 3D 21 cm auto power spectrum
for various redshifts/mean neutral fractions in our simulations.
Power (particularly at small scales) decreases with decreasing
neutral fraction.

Figure 2. The spherically averaged 3D 21 cm - halo cross power
spectrum for z = 8.32, corresponding to
$\langle x_{\mathrm{HI}} \rangle = 0.53$. Also shown
are the components of the 21 cm - halo cross power spectrum.

To better understand the behaviour of the cross power
spectrum, we now study two toy models. For the first model
we begin with a neutral universe and place ionized spheres
with radius approximately $8\,h^{-1}$ Mpc around halos with
mass greater than $10^{10}\,M_\odot$ (this value gives an ionized
fraction of roughly 50%). The second model is the dual of the
first: the background universe is taken to be ionized and the
bubbles contain neutral hydrogen. For both we use the same
density field and halo distribution as above, namely the one
from redshift z = 8.32, where the BEARS simulation also had
$\langle x_{\mathrm{HI}} \rangle \approx 0.5$.
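
The two toy models can be sketched in a few lines of Python. The version below paints spheres around a list of halo positions on a periodic grid and returns the neutral fraction field for the ionized-bubble model and for its dual. The function names and the brute-force loop over halos are illustrative assumptions, not the implementation used for the figures.

```python
import numpy as np


def sphere_mask(n_grid, box_size, centres, radius):
    """Boolean grid that is True within `radius` of any centre (periodic box)."""
    axis = (np.arange(n_grid) + 0.5) * box_size / n_grid  # cell-centre coordinates
    mask = np.zeros((n_grid, n_grid, n_grid), dtype=bool)
    for centre in centres:
        # Per-axis squared separations using the minimum-image convention.
        sq = []
        for c in centre:
            d = np.abs(axis - c)
            sq.append(np.minimum(d, box_size - d) ** 2)
        r2 = sq[0][:, None, None] + sq[1][None, :, None] + sq[2][None, None, :]
        mask |= r2 <= radius ** 2
    return mask


def toy_neutral_fractions(n_grid, box_size, halo_positions, radius=8.0):
    """Return x_HI grids for the ionized-bubble model and for its dual."""
    inside = sphere_mask(n_grid, box_size, halo_positions, radius)
    x_hi_ionized_bubbles = np.where(inside, 0.0, 1.0)  # neutral universe, ionized spheres
    x_hi_neutral_bubbles = 1.0 - x_hi_ionized_bubbles  # ionized universe, neutral spheres
    return x_hi_ionized_bubbles, x_hi_neutral_bubbles
```

Because the second field is one minus the first, the two models have mean neutral fractions that sum to one, so both are close to 0.5 when the spheres fill about half of the volume.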

We show the cross power spectrum and the cross-correlation
coefficient for these models, as well as the prediction
from BEARS (black solid line in Fig. 2), in Fig. 3.
The first thing we notice is that all three curves look
remarkably similar. In the toy models the oscillations begin
at larger k since the characteristic size of neutral regions is
larger compared to the radiative transfer approach. The bottom
panel shows the cross-correlation coefficient, defined as
$r(k) = P_{21,\mathrm{gal}}(k) / [P_{21}(k)\,P_{\mathrm{gal}}(k)]^{1/2}$.
Here we confirm that for the neutral bubble model the 21 cm
emission is strongly correlated with the halo positions on
large scales. At smaller scales the halo density is uncorrelated
with the 21 cm emission since an $8\,h^{-1}$ Mpc bubble can
contain one or many halos.
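
The cross-correlation coefficient defined above can be estimated directly from two gridded fields. In the sketch below the Fourier normalization cancels in the ratio, so raw FFT amplitudes are used and k is left in grid units; the linear binning and the function name are assumptions of this example.

```python
import numpy as np


def correlation_coefficient(field_a, field_b, n_bins=8):
    """Shell-averaged r(k) = P_ab / sqrt(P_aa P_bb) for two cubic real grids.

    k is returned in grid units (cycles per cell). Both fields are assumed
    to have non-zero power in every populated shell.
    """
    n = field_a.shape[0]
    fa, fb = np.fft.fftn(field_a), np.fft.fftn(field_b)
    freq = np.fft.fftfreq(n)
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    kmag = np.sqrt(kx ** 2 + ky ** 2 + kz ** 2).ravel()

    edges = np.linspace(0.5 / n, 0.5, n_bins + 1)
    shell = np.digitize(kmag, edges) - 1
    keep = (shell >= 0) & (shell < n_bins)  # excludes k = 0 and the cube corners

    def shell_sum(values):
        return np.bincount(shell[keep], weights=values.ravel()[keep], minlength=n_bins)

    counts = np.bincount(shell[keep], minlength=n_bins)
    filled = counts > 0
    # Shell sums suffice: the mode counts cancel in the ratio.
    p_ab = shell_sum((fa * np.conj(fb)).real)[filled]
    p_aa = shell_sum(np.abs(fa) ** 2)[filled]
    p_bb = shell_sum(np.abs(fb) ** 2)[filled]
    k_mean = shell_sum(kmag)[filled] / counts[filled]
    return k_mean, p_ab / np.sqrt(p_aa * p_bb)
```

By construction r(k) lies between -1 (perfect anti-correlation) and +1 (perfect correlation), which makes it easier to compare models with different normalizations than the cross power spectrum itself.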

Figure 3. The spherically averaged 3D 21 cm - halo cross power
spectrum (upper panel) and cross-correlation coefficient (lower
panel) for two toy models at z = 8.32,
$\langle x_{\mathrm{HI}} \rangle \sim 0.5$. The black
solid line shows the result using BEARS, while the red dashed line
shows a toy model in which $8\,h^{-1}$ Mpc ionized bubbles are placed
around halos, and the blue dot-dashed line shows a scenario where
$8\,h^{-1}$ Mpc neutral bubbles are placed around halos in a fully
ionized universe. In these extreme cases the general behaviour in the
cross power spectrum is similar.

Naturally, the size of our toy bubbles is arbitrary and
has been chosen so that the average neutral fraction is ~ 0.5
for all models. If we were to use larger bubbles (they would
need to be less ionized to maintain our restriction of a global
ionized fraction of roughly 50%) we would expect the oscillations
to occur at slightly lower k, while we would expect the
opposite for smaller bubbles. This implies that the characteristic
size of the bubbles in the BEARS simulation is slightly
larger than $8\,h^{-1}$ Mpc, likely owing to the dominance of high
mass sources.

Just as the normalization of the 21 cm auto power spec- 
trum decreases with decreasing neutral fraction, so does the 
21 cm - halo cross power spectrum. This is seen in Fig. 4,
where we plot the cross power spectrum from the BEARS results
for our selected redshifts. Note that the scale on which
the oscillations occur increases with decreasing redshift as 
the HII regions grow, until the field is completely ionized. 
The power spectrum also becomes flatter as the correlations 
on small scales diminish. 

The lower panel of Fig. 4 shows the correlation coefficient
for the same set of redshifts. Here we see that at early
times (z ~ 9), the signals are strongly anti-correlated on
intermediate scales ($k \sim 0.3 - 0.4\,h$ Mpc$^{-1}$). This is
because the bubbles are still small. As the ionized regions
become larger, the signals become anti-correlated only on
large scales.

All of this conforms to the results found in Lidz et al.
(2009). We now turn to a more careful prediction of the
observed signal.



4 PREDICTIONS FOR THE OBSERVED 21 CM 
- GALAXY CROSS POWER SPECTRUM 

The spherically averaged 21 cm - galaxy (halo) cross power
spectrum will not be directly measured by observations. In
practice, the cross power spectrum will be circularly averaged
after the galaxy field has been projected onto two dimensions,
the galaxy sample will be constrained by some selection criteria,
and there will be substantial instrumental effects for both the
21 cm signal and the galaxy sample.

Figure 4. The spherically averaged 3D 21 cm - halo cross power
spectrum (upper panel) and correlation coefficient (lower panel)
for various redshifts/mean neutral fractions in our simulations.
The anti-correlation is very strong on medium scales, but
diminishes as the universe becomes more ionized.

We make use of the De Lucia et al. (2007) semi-analytic
model (SAM) to generate the necessary galaxy data. This
is a well-studied model that roughly reproduces many z = 0
observations. The semi-analytic galaxy catalogue contains
both rest-frame and observer-frame magnitudes in many
bands. To account for the change in cosmology, we scale
the magnitudes of the galaxies with the same mass factor
mentioned in Section 2 and convert the wavelength of the
bands to account for the shift in redshift (see footnote 9).
In Table 1 we compare the original bands to the converted
bands. We caution that this model is designed to fit local
universe data, and that its predictions for the high-redshift
universe may not be correct. On the other hand, the SAM acts
as a 'best guess' since fully analytic calculations would
neglect the interplay between galaxies, while hydrodynamic
simulations would likely underestimate the star formation rate
at these redshifts (e.g. Schaye et al. 2010).

To perform our predictions, we consider a very large
observational survey. A particularly ambitious program would
cover 3 square degrees and overlap with previously well-studied
fields (e.g. the Subaru Deep Field, Kashikawa et al. 2004,
and GOODS, Giavalisco et al. 2004). We consider surveys
for two types of sources, high-redshift Lyman break
galaxies (LBGs) and Lyman alpha emitters (LAEs). The former
uses broad band photometry combined with a drop-out
technique to detect the rest-frame 912 Å break. Such surveys
typically use a strong colour cut to distinguish LBGs
from other objects (e.g. Bouwens et al. 2008). They have the
advantage that existing filters can be used and combined
with already well-studied fields. One large disadvantage is
that precise redshifts cannot be determined without
spectroscopic follow-up.

LAEs, on the other hand, are objects that emit very
strongly in the 1216 Å Lyα line. They are typically found
using narrow-band surveys (e.g. Ouchi et al. 2010). Since a
narrow filter is used, the redshifts of the objects are rather
tightly constrained. The precise nature of LAEs is, however,
currently unknown.

9 Some of the calculations in this work made use of the tool
developed by Wright (2006).

Table 1. Central wavelength of bands in the observer frame of
reference from the original SAM catalogue (left column), the
catalogue converted to account for a different cosmology (central)
and observed bands (right). See text for details.

Original SAM band    Converted band (1)    Corresponding observed band
u (3500 Å)           2947 Å                U (3600 Å)
g (4800 Å)           4042 Å                B (4400 Å)
r (6250 Å)           5263 Å                V (5500 Å)
i (7700 Å)           6485 Å                R (6400 Å)
z (9100 Å)           7662 Å                z' (9100 Å)
J (12600 Å)          10610 Å               y (9860 Å)

(1) The actual value varies slightly (±10 Å) with redshift so we give
approximate values here.

Table 2. Number of galaxies in the simulation box (second column),
those selected as observable LBGs (third column), and
those selected as LAEs (final column). See text for details.

Redshift    Total number of galaxies in the SAM    Selected LBGs    Selected LAEs
9.06        72791                                  130              153
8.32        189904                                 351              581
7.66        420591                                 768              1940
7.04        820361                                 421              5517
6.48        1436450                                1661             13109

4.1 Predictions for drop-out surveys 

We search for LBGs at all redshifts of interest, using the
detection limits of the survey presented in Ouchi et al. (2009).
Their selection criteria were that an object must not be
detected in their 4 blue continuum bands (U, B, V, R >
27.4, 28.0, 26.7, 27.0) and at the same time have a z - y
colour greater than 1.5. We relax this colour criterion to 1,
and instead of looking at the Lyα trough as they did, we consider
the Lyman break, since the positioning of our observing
bands is determined by the SAM and we therefore cannot
centre so precisely on the Lyα trough. Table 1 shows how
the SAM output bands are affected by the redshift scaling
and how they relate to the bands used for the Ouchi et al.
(2009) selection.
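
The drop-out selection described above amounts to a pair of simple cuts on the catalogue magnitudes. The following sketch applies them with NumPy. The dictionary layout of the input is hypothetical, and no separate detection limit in the z or y band is applied because none is quoted in the text.

```python
import numpy as np

# Non-detection limits quoted in the text for the four blue continuum bands.
BLUE_LIMITS = {"U": 27.4, "B": 28.0, "V": 26.7, "R": 27.0}


def select_lbg(mags, colour_cut=1.0):
    """Return a boolean mask of drop-out candidates.

    mags : dict mapping band name ('U', 'B', 'V', 'R', 'z', 'y') to arrays of
           apparent magnitudes (hypothetical input layout).
    """
    z_mag = np.asarray(mags["z"], dtype=float)
    y_mag = np.asarray(mags["y"], dtype=float)
    undetected = np.ones(z_mag.shape, dtype=bool)
    for band, limit in BLUE_LIMITS.items():
        # Fainter than the limit (larger magnitude) means not detected.
        undetected &= np.asarray(mags[band], dtype=float) > limit
    return undetected & ((z_mag - y_mag) > colour_cut)
```

Calling `select_lbg(mags, colour_cut=1.5)` would instead apply the stricter colour criterion of the original survey.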

Table 2 gives the detection efficiency of our galaxies for
each redshift of interest. Galaxies are most efficiently
detected at redshift 7.66 ($\langle x_{\mathrm{HI}} \rangle = 0.21$),
likely due to a combination of two factors. First, a greater
fraction of galaxies are detectable than at higher redshift
since the instruments can detect galaxies with a lower absolute
magnitude, and second, there is a higher fraction of bright,
star-forming galaxies than at lower redshift. The total number
of detected galaxies decreases at z = 7.04
($\langle x_{\mathrm{HI}} \rangle = 0.081$) because for this
redshift the Lyman break falls in the middle of one of the
bands, so the colour difference cannot be efficiently detected.




Figure 5. The circularly averaged 2D 21 cm - galaxy cross power
spectrum for z = 8.32, $\langle x_{\mathrm{HI}} \rangle = 0.53$
for a dropout survey. Also shown are the components of the
21 cm - galaxy cross power spectrum. The general shape and
normalization of the power is recovered, but a number of
features are lost.

In a drop-out survey without any spectroscopic follow-up,
there will be no radial distance information and all of
the objects will be projected onto the same plane. The filter
width used for these telescopes is less than the width of
our simulation box, so we choose to project only a random
slab of the box with a thickness that roughly corresponds to
1000 Å, which is the typical full-width at half maximum of
the response function that is used in photometric surveys.
The SAM outputs a galaxy catalogue gridded according to
a count-in-cell scheme. For each line of sight through the
slab, we calculate the mean number density weighted by a
Gaussian function ($\sigma = 0.25\,l$, where $l$ is the thickness
of the slab) to approximate the filter response function, since
each filter has a different response function anyway (see
footnote 10). The 21 cm
signal is an equally weighted average across the entire slab.
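
The projection step can be sketched as follows. The Gaussian is assumed to be centred on the mid-plane of the slab, which the text does not state explicitly, and the input arrays are taken to cover the slab only; the function name is hypothetical.

```python
import numpy as np


def project_slab(galaxy_density, delta_tb, axis=2, sigma_fraction=0.25):
    """Collapse a slab along the line of sight.

    galaxy_density, delta_tb : 3D arrays covering the slab only
    sigma_fraction           : Gaussian sigma in units of the slab thickness l
    """
    n_los = galaxy_density.shape[axis]
    # Cell-centre depths in units of l, measured from the slab mid-plane
    # (centring the Gaussian there is an assumption of this sketch).
    depth = (np.arange(n_los) + 0.5) / n_los - 0.5
    weights = np.exp(-0.5 * (depth / sigma_fraction) ** 2)
    weights /= weights.sum()
    # Gaussian-weighted mean for the galaxies, plain mean for the 21 cm signal.
    galaxies_2d = np.tensordot(galaxy_density, weights, axes=([axis], [0]))
    signal_2d = delta_tb.mean(axis=axis)
    return galaxies_2d, signal_2d
```

Because the weights are normalized to unit sum, a spatially uniform galaxy density is returned unchanged, while galaxies near the front or back face of the slab contribute less than those near its centre.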

In Fig. 5 we plot the projected, circularly averaged,
21 cm - galaxy cross power spectrum predicted from our
simulations for z = 8.32 ($\langle x_{\mathrm{HI}} \rangle = 0.53$;
see footnote 11). We have again
broken it into the individual components from Equation (1).
Note that for this figure we have not yet included the noise
in the 21 cm signal. We notice immediately that the power
spectra are much noisier, and the oscillations observed in
Fig. 2 are absent. On the other hand, some general trends
are retained, such as the increase of power with decreasing
scale. Also as before, the $\Delta^2_{x,\mathrm{gal}}(k)$ term
defines the shape of
the 21 cm - galaxy cross power spectrum. The higher values
obtained here for low k are most probably due to differences
arising when calculating the 2D or 3D spectrum.

Next we add noise appropriate to a 600 h observation 
with the LOFAR core to the redshifted 21 cm signal. The 
actual LOFAR station positions are used to generate uv 
tracks for a 4 h observation at the zenith. From this we de- 



10 We neglect the possibility of galaxies being coincidental along 
the line of sight since we expect the angular resolution to be high 
enough to make this effect minimal. 

11 Before our results depended mostly on the ionized fraction 
and the precise redshift was unimportant. The use of the SAM 
has now made our predictions somewhat redshift dependent. 
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Figure 6. The circularly averaged, unnormalized 2D 21 cm - 
galaxy cross power spectrum (upper panel) and correlation coef- 
ficient (lower panel) for various redshifts/mean neutral fractions 
in our simulations in the dropout survey case. Although noisy, 
some of the general trends of the spherically averaged 3D power 
spectrum are recovered. 



Figure 7. The circularly averaged, unnormalized 2D 21 cm - 
galaxy cross power spectrum calculated with (solid lines) and 
without (dashed lines) LOFAR noise in the dropout survey case. 
The noise inherent in the LOFAR instrument causes the two mea- 
surements to become similar. 



fine a sampling function S(u, v) which describes how densely 
the interferometer baselines sample Fourier space over the 
course of an observation, such that $1/\sqrt{S}$ is proportional
to the noise on the measurement of the Fourier transform
of the sky in each uv cell. We generate uncorrelated, complex
Gaussian noise in each cell, but enforce the condition
$n(u,v)^* = n(-u,-v)$, where $n(u,v)$ is the noise in the cell
with coordinates (u, v). This ensures that when we perform 
a two-dimensional Fourier transform of this noise realization 
to obtain a noise image, the image is real. The noise image 
is normalized to ensure it has a temperature rms 

\[
\sigma_{\rm noise} = \frac{\lambda^2\, T_{\rm sys}}{A_{\rm eff}\,\Omega_{\rm beam}\,\sqrt{2 n_s (n_s - 1)\, t_{\rm int}\,\Delta\nu}} \qquad (3)
\]

where $\lambda$ is the observed wavelength, $T_{\rm sys}$ is the system
temperature, $A_{\rm eff}$ is the effective area of a LOFAR station,
$n_s$ is the number of stations, $t_{\rm int}$ is the integration time, $\Delta\nu$
is the bandwidth and $\Omega_{\rm beam}$ is the area of the synthesized
beam. We take $T_{\rm sys} = 140 + 60\,(\nu/300\,{\rm MHz})^{-2.55}$ K and
use values for the effective area tabulated on the ASTRON
LOFAR webpage¹².
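
The noise procedure described above can be sketched as follows. This is a hedged illustration rather than the code used for the paper: the function names are invented for the example, the sampling grid is random rather than derived from real LOFAR uv tracks, cells with S = 0 are simply given zero noise, and every numerical value in the usage lines except the $T_{\rm sys}$ formula is a placeholder.

```python
import numpy as np

def system_temperature(freq_mhz):
    """T_sys = 140 + 60 (nu / 300 MHz)^-2.55 in kelvin, as quoted in the text."""
    return 140.0 + 60.0 * (freq_mhz / 300.0) ** -2.55

def noise_rms(wavelength, t_sys, a_eff, omega_beam, n_stations, t_int, bandwidth):
    """Temperature rms of the noise image (SI inputs, result in kelvin)."""
    samples = 2.0 * n_stations * (n_stations - 1) * t_int * bandwidth
    return wavelength ** 2 * t_sys / (a_eff * omega_beam * np.sqrt(samples))

def noise_image(sampling, sigma, rng):
    """Real noise map whose Fourier amplitudes scale as 1 / sqrt(S(u, v))."""
    # Assumption: unsampled cells (S = 0) carry no noise in this sketch.
    weight = np.zeros(sampling.shape, dtype=float)
    weight[sampling > 0] = 1.0 / np.sqrt(sampling[sampling > 0])
    shape = sampling.shape
    noise_uv = weight * (rng.normal(size=shape) + 1j * rng.normal(size=shape))
    # Value at (-u, -v) in NumPy's FFT index layout.
    mirrored = np.roll(np.flip(noise_uv, axis=(0, 1)), shift=1, axis=(0, 1))
    # Enforce n(u, v)* = n(-u, -v) so that the inverse transform is real.
    hermitian = noise_uv + np.conj(mirrored)
    image = np.fft.ifft2(hermitian)
    assert np.abs(image.imag).max() <= 1e-9 * np.abs(image.real).max()
    image = image.real
    # Normalize to the requested rms (standard deviation about the mean).
    return image * (sigma / image.std())

# Placeholder numbers, for illustration only.
rng = np.random.default_rng(0)
sampling = rng.integers(0, 20, size=(64, 64))      # hypothetical S(u, v)
sigma = noise_rms(wavelength=2.0,                  # metres, roughly 150 MHz
                  t_sys=system_temperature(150.0),
                  a_eff=500.0,                     # m^2, hypothetical
                  omega_beam=7.6e-7,               # sr, hypothetical
                  n_stations=48,                   # hypothetical
                  t_int=600 * 3600.0,              # 600 h in seconds
                  bandwidth=1.0e6)                 # Hz, hypothetical
noise_map = noise_image(sampling, sigma, rng)
```

Applying the weights before symmetrizing means the result stays Hermitian even if the supplied sampling grid is not itself symmetric under (u, v) → (−u, −v).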

Since some upcoming surveys are expected to cover very 
large fields (upwards of three square degrees, corresponding 
to ~ 60 proper Mpc at z = 6.48), we take 4 random slabs 
(for a square configuration these correspond to ~ 77 proper 
Mpc) through the box and average them to make our pre- 
dictions less susceptible to sample variance. 
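
As an illustration of the slab averaging, the sketch below draws randomly placed slabs from a periodic cube and projects each one with an equally weighted average along the line of sight. The function name `random_slab_maps`, the choice of axis, and the toy statistic at the end are assumptions for the example. For a galaxy catalogue one would project counts rather than a mean.

```python
import numpy as np

def random_slab_maps(box, thickness, n_slabs, rng, axis=0):
    """Project n_slabs randomly placed slabs of a periodic cube onto 2D maps."""
    n = box.shape[axis]
    starts = rng.integers(0, n, size=n_slabs)
    maps = []
    for start in starts:
        # Wrap around the box edge, assuming periodic boundary conditions.
        cells = np.arange(start, start + thickness) % n
        maps.append(np.take(box, cells, axis=axis).mean(axis=axis))
    return maps

# Toy usage: average any per-slab statistic over 4 random slabs.
rng = np.random.default_rng(0)
box = rng.normal(size=(32, 32, 32))
maps = random_slab_maps(box, thickness=11, n_slabs=4, rng=rng)
mean_variance = np.mean([m.var() for m in maps])
```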

In Fig. [6] we present the result for our output redshifts of 
interest. Please note that this figure shows the true measured 
(unnormalized) cross power spectrum between the galaxy 
field and the 21 cm emission, $\Delta^2_{21,\mathrm{gal}}(k)$, as defined in Equation [1],
and not the normalized one as in the previous figures.
Very few of the trends seen in Fig. [4] are recovered. The be- 
haviour of increasing power with decreasing scale is main- 
tained, but the oscillations have all but disappeared. The 

12 http://www.astron.nl/radio-observatory/astronomers/lofar-imaging-capabilities-sensitivity/lofar-imaging-capabilities-and-



normalization of the cross power spectrum increases on all 
scales with decreasing redshift, but the effect is small, such 
that it would be difficult to distinguish between reionization 
states using the cross power spectrum alone. The cross cor- 
relation coefficients (shown in the bottom panel) are quite 
noisy and provide little information. 

This indicates that using simple detection and selection 
techniques will make it extremely difficult to glean informa- 
tion from the 21 cm - galaxy cross power spectrum. In our 
case, we simply do not detect enough galaxies at the highest 
redshifts, too much information is lost in the projection of 
the galaxy field, and the observing noise is too large for a 
significant statement to be made. 

Fig. [7] shows the impact that the LOFAR noise has at 
z = 6.48 (red lines) and at z = 8.32 (black lines). Here
we see that in the absence of noise (dashed lines), the two 
redshifts appear to be quite distinct. When noise is intro- 
duced (solid lines), the curves appear more similar. While 
at large scales adding the noise always decreases the power 
spectrum, at small scales its effect depends on redshift. In
fact, unlike the auto power spectrum, where adding the
noise always adds power, the cross power spectrum moves
towards what it would be if the galaxy field were crossed
with a pure noise field. This means that at z = 8.32 the cross
power spectrum between the pure noise and galaxy fields is
lower than that of the 21 cm and galaxy fields, whereas at
z = 6.48 the 21 cm signal is reduced by a much larger factor
than the noise is, resulting in the opposite behaviour. This is
seen only at small scales because these are the ones where
the drop in signal is more significant (see Fig. [4]). In other
words, after the noise is introduced, the $\Delta^2_{x\rho,\mathrm{gal}}(k)$ and
$\Delta^2_{x,\mathrm{gal}}(k)$ terms have diminished influence, while the
dominant term becomes $\Delta^2_{\rho,\mathrm{gal}}(k)$.
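
The different response of the auto and cross power spectra to noise follows from linearity. As a sketch, assume the observed map is the sum of the 21 cm signal and a noise realization $n$ that is statistically independent of the galaxy field; the symbols $T_{\rm obs}$, $P_{a,b}$ and $N_k$ are introduced here for illustration only.

```latex
\begin{align}
T_{\rm obs}(\mathbf{x}) &= T_{21}(\mathbf{x}) + n(\mathbf{x}), \\
P_{\rm obs,gal}(k) &= P_{21,{\rm gal}}(k) + P_{n,{\rm gal}}(k), \\
P_{\rm obs,obs}(k) &= P_{21,21}(k) + P_{n,n}(k) + 2\,P_{21,n}(k).
\end{align}
```

Averaged over many realizations, $\langle P_{n,{\rm gal}}\rangle = 0$ while $P_{n,n} > 0$, so noise biases the auto power upwards but leaves the cross power unbiased. In a single realization, however, $P_{n,{\rm gal}}$ scatters about zero by an amount of order $\sqrt{P_{n,n}P_{\rm gal,gal}/N_k}$ for $N_k$ modes in the bin, so once $|P_{21,{\rm gal}}|$ becomes small the measured curve is pulled towards the galaxy-noise cross spectrum.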

Finally, we revisit our toy models to determine if the 
observed cross power spectrum can tell us anything about 
the topology of reionization. We have used the same toy 
models mentioned in the previous section and have applied 
the same procedure described in this section. In Fig. [8] we 
Figure 8. The circularly averaged, unnormalized 2D 21 cm -
galaxy cross power spectrum (upper panel) and correlation coefficient
(lower panel) for two toy models at z = 8.32, $\langle x_{\rm HI}\rangle = 0.53$
in the dropout survey case. The correlation coefficient is able to 
distinguish between the two scenarios. 
Figure 9. The circularly averaged 2D 21 cm - galaxy cross power
spectrum considering a LAE survey for z = 8.32, $\langle x_{\rm HI}\rangle = 0.53$.
Also shown are the components of the 21 cm - galaxy cross power
spectrum. The general trend of Fig. [5] is maintained.



see that these two very different reionization scenarios result 
in a similar cross power spectrum. There is certainly more 
power on medium to large scales in the neutral bubble case, 
indicating that the cross power spectrum could help identify 
different scenarios. One area in which these two scenarios
were distinct in section [3] was the cross-correlation coefficient.
In the lower panel of Fig. [3] they were markedly different
on large scales. In Fig. [8] the difference is best noticed
on medium scales.



4.2 Predictions for LAE surveys 

Ouchi et al. (2010) report how they refined their dropout
technique using a narrow band filter to identify LAEs. They 
then obtained follow-up spectra to establish the redshifts 
of their galaxies. While only a narrow band in redshift is 
probed, the three dimensional positions of the objects can 
be established much more precisely. 

We can therefore follow a similar procedure as we did 
in the previous section, with a few slight modifications. The 
width of the narrow-band filter used by Ouchi et al. (2010)
is 132 Å, which roughly corresponds to a slab about
half the thickness considered in the previous section. Since 
their filter is focused on a very narrow band that we do not 
have access to in the semi-analytic model, we cannot select 
based on the same colour criterion they used. Therefore, 
we use the star-formation rate to intrinsic Lyα luminosity
conversion reported in Dayal et al. (2008):

\[
L_\alpha^{\rm int} = 2.80 \times 10^{42}\,{\rm erg\,s^{-1}}\; \frac{\rm SFR}{M_\odot\,{\rm yr}^{-1}} \qquad (4)
\]



This will be attenuated and modified by a number of
factors: the escape fraction of ionizing photons, the fraction
of Lyα photons that are destroyed by dust, gas inflows
and outflows, and any intergalactic absorption and
scattering (see e.g. Dijkstra et al. 2007; Kobayashi et al. 2007;
Dijkstra & Wyithe 2010; Jeeson-Daniel et al. 2012).
The latter is expected to be particularly relevant at these
redshifts, when a substantial neutral fraction is expected,
but it is beyond the scope of this paper to investigate this
issue in more detail. We defer further analysis to the future.
In order to select a similar number of galaxies as Ouchi et al.
(2010) (see below), we assume the transmitted Lyα luminosity
to be a factor of 15 less than the intrinsic.

Ouchi et al. (2010) report that they are sensitive to a
Lyα luminosity of $2.5 \times 10^{42}$ erg s$^{-1}$ at z = 6.56. We have
converted this using Equation [4] to act as a star-formation
rate cut on our galaxies at each of our redshifts of interest.
We use their detection limits to ensure that there is no
continuum emission bluewards of the Lyα break.
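
As a worked example of this cut, the sketch below converts a Lyα luminosity limit into a minimum star-formation rate. It assumes the coefficient of Equation 4 is $2.80 \times 10^{42}$ erg s$^{-1}$ per $M_\odot$ yr$^{-1}$ and applies the factor of 15 between intrinsic and transmitted luminosity; the function names and the sample star-formation rates are hypothetical.

```python
LYA_PER_SFR = 2.80e42        # erg/s per (Msun/yr); assumed coefficient of Equation 4
TRANSMISSION_FACTOR = 15.0   # intrinsic / transmitted luminosity, as assumed in the text

def transmitted_lya_luminosity(sfr):
    """Transmitted Lya luminosity (erg/s) for a star-formation rate in Msun/yr."""
    return LYA_PER_SFR * sfr / TRANSMISSION_FACTOR

def sfr_threshold(lya_limit):
    """Minimum SFR (Msun/yr) whose transmitted Lya luminosity reaches lya_limit."""
    return lya_limit * TRANSMISSION_FACTOR / LYA_PER_SFR

# Sensitivity quoted for the z = 6.56 narrow-band survey.
cut = sfr_threshold(2.5e42)

# Hypothetical star-formation rates (Msun/yr) for three model galaxies.
sfrs = [5.0, 20.0, 100.0]
selected = [sfr >= cut for sfr in sfrs]
```

With these assumptions the luminosity limit corresponds to a star-formation rate of about 13 $M_\odot$ yr$^{-1}$, so only the most actively star-forming model galaxies pass the selection.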

In Table [2] we give the detection efficiency for the entire
box. Since we only take a slab, the actual number of galaxies
will be the number in the right-hand column adjusted for the
slab thickness. We note that Ouchi et al. (2010) detected 207
LAEs at z ≈ 6.6 in a plane of similar size to ours; given that
we use a slab thickness of 11 (out of 256) at this redshift, we
only slightly overestimate the number of detectable LAEs
(albeit at a lower redshift).

Fig. [9] is the analogue of Fig. [5]. Most of the features
found in the dropout cross power spectrum persist for the
LAEs. Interestingly, the $\Delta^2_{x,\mathrm{gal}}(k)$ term dominates even
more here. Note that for this figure we have not included
the noise in the 21 cm signal.

In Fig. [10] we show the 21 cm - LAE cross power spectrum
and correlation coefficient for a number of redshifts.
Here we have again averaged the calculation over 4 random
slabs in the simulation box and included the 21 cm
noise. The result is similar to that in the drop-out case
(Fig. [6]) in that many of the features found in the theoretical
case (Fig. [2]) are not recovered. The cross-correlation
coefficient, though, shows more marked changes as reion- 
ization progresses. At high redshifts and moderate ionized 
fractions, the correlation coefficient is negative, but it turns 
positive for very low neutral fractions. The correlation co- 
efficient again appears to be the key to drawing meaningful 
conclusions from these two measurements. 

We briefly revisit the effect of noise from the LOFAR 
instrument on our measurements. In Fig. [11] we repeat the
comparison we made in Fig. [7]. Here we find that at low
Figure 10. The circularly averaged, unnormalized 2D 21 cm -
galaxy cross power spectrum (upper panel) and correlation coef- 
ficient (lower panel) for various redshifts/mean neutral fractions 
(see Fig. [4] for a legend) in our simulations in the LAE survey 
case. Slight differences in the cross power spectrum are seen for 
different redshifts. 



Figure 12. The circularly averaged, unnormalized 2D 21 cm - 
galaxy cross power spectrum (upper panel) and correlation coefficient
(lower panel) for two toy models at z = 8.32, $\langle x_{\rm HI}\rangle = 0.53$
in the LAE survey case. It is possible to distinguish between the
two ionization topologies using both the cross power spectrum 
and the correlation coefficient. 
Figure 11. The circularly averaged, unnormalized 2D 21 cm -
galaxy cross power spectrum calculated with (solid lines) and
without (dashed lines) LOFAR noise in the LAE survey case.
The addition of LOFAR noise makes it difficult to detect the
suppression of power on small scales as the redshift decreases. 

redshift, while there is an intrinsic suppression of power on 
small scales, the noise serves to mask that effect. On large 
scales the noise decreases the power for both redshifts. 

A LAE survey could yield useful information about the 
progress of reionization, but does it say anything about the 
topology of reionization? In Fig. [12] we return to our toy 
models and apply the same calculation machinery. The result
is a somewhat clearer distinction between the two reionization
scenarios. At scales of roughly $5\,h^{-1}$ Mpc, some difference
is noted in the cross power spectrum, while at larger
scales significant differences in the correlation coefficient appear.

Finally, we consider four specific observing strategies 
for the Subaru Hyper-Suprime Cam (kindly provided to us 
by Masami Ouchi). The details are outlined in Table [3].
Note that while there are deep and ultra-deep observations 



Table 3. Characteristics of four observing strategies with the 
Subaru Hyper-Suprime Cam. 



Redshift   Number      Total area         L_{α,min}
           of fields   (square degrees)   (erg/s)

7.3        2           3.5                1.3 x 10^43
6.6        2           3.5                2.8 x 10^42
6.6        4           28                 6.2 x 10^42
6.6        1           4                  6.2 x 10^42



planned for the z = 6.6 case, there is only an ultra-deep ob- 
servation planned for the z = 7.3 case. The current plan is 
to observe one of the z = 6.6 deep fields with LOFAR, but 
here we consider all of the cases shown in Table [3].

Our outputs do not line up precisely with the redshifts 
for these planned observations, so we use the z = 6.48 and 
z = 7.04 outputs and expect that the difference would be
minimal. Furthermore, we use a single slice of our simulation 
box to test the 3.5 and 4 square degree cases (corresponding 
to ~ 40 proper Mpc at z = 6.48) and average 9 slices for 
the 28 square degree case (corresponding to ~ 107 proper 
Mpc at z = 6.48). As for the estimates discussed earlier, we 
convert the Lyα luminosities to equivalent star formation
rates using Equation [4].

The results are shown in Figs. [13] and [14] for z = 6.48
and z = 7.04, respectively. The first noticeable thing is that
it makes little difference when the Lyα luminosity threshold
or the observing area are changed by factors of a few.
As we saw in the previous section, a major component in
the shape of the curve is the LOFAR observing noise, not
these two factors. Averaging over more fields does seem to
smooth the curves slightly (although the curve in Fig. [13]
is smoother only by chance), but it is not clear that using
a deeper or wider field will improve the measurement sig- 
nificantly. If our fiducial reionization scenario is reasonable, 
these observations might be better performed at higher red- 
shift in order to obtain a stronger 21 cm signal. 
Figure 13. The circularly averaged, unnormalized 2D 21 cm -
galaxy cross power spectrum (upper panel) and correlation
coefficient (lower panel) for three specific Subaru Hyper-Suprime
Cam observing strategies at z = 6.48. Also shown by the solid
line is the z = 6.48 analysis from Fig. 10. Little difference is seen 
between the four curves. 
Figure 14. The circularly averaged, unnormalized 2D 21 cm -
galaxy cross power spectrum (upper panel) and correlation coefficient
(lower panel) for the Subaru Hyper-Suprime Cam observing strategy
at z = 7.04. Also shown by the solid line is the z = 7.04 analysis
from Fig. 10. Little difference is seen between the two curves. 



5 CONCLUSIONS 

We have endeavoured to make predictions for the 21 cm - 
galaxy cross power spectrum. Using our BEARS code, we have 
performed radiative transfer simulations on the well-studied 
Millennium Simulation. Beginning with the spherically av- 
eraged dimensionless cross power spectrum between 21 cm 
emission and dark matter halos, we investigated how these 
two fields relate before attempting to make predictions for 
observations. 

In general, we confirm the results of Lidz et al. (2009),
who made their own predictions for the 21 cm - halo cross 
power spectrum. We find a similar shape and normalization, 
but owing to our coarser grid and poorer resolution, we find 
that at large k our power spectrum shows oscillations related 
to the characteristic bubble size. 

The 21 cm emission is initially correlated with halos on 



large scales, anti-correlated on medium, and uncorrelated 
on small scales. This picture changes quickly as reionization 
proceeds and the two fields become anti-correlated on large 
scales. Through toy models it becomes apparent that these 
correlations can be a useful tool for inferring the topology
of reionization. 

We then take the analysis in a different direction from
Lidz et al. (2009). We attempt to make a more detailed
mock observation of this signal in order to make predictions
for upcoming surveys, using a well-studied semi-analytic
model of galaxy formation and evolution (De Lucia et al.
2006) to bridge the gap between halos and galaxies. Applying
the drop-out technique used in the Ouchi et al. (2009)
survey severely reduced the number of galaxies at our disposal.

To further simulate the effect of observing and select- 
ing real galaxies, we considered only a slab of our simulation 
box corresponding to the typical width of a filter. We then 
projected this slice and circularly averaged the power spec- 
trum. We also added the noise expected from the LOFAR 
instrument to the 21 cm signal. 

The result is that while the shape of the cross power 
spectrum is nominally preserved, its normalization seems 
to be the most powerful tool for probing reionization. In 
particular, it is sensitive to the ionized fraction as we show 
that different reionization histories yield similar cross power 
spectra for the same ionized fraction. 

Compounding these problems is the fact that any 
galaxy survey will likely focus on a specific drop out range. 
We have seen that the cross power spectrum is quite useful 
when comparing the relative differences between different 
models or redshifts. In the absence of another redshift with
which to compare results, it might be troublesome to
come up with a robust statement about reionization.

We turned to a more precise measurement of the galaxy 
redshifts that would be found using a LAE survey. We found 
that if the radial position of the galaxy is known, then much 
more information about the nature of reionization can be 
gleaned from cross-correlating the galaxy and 21 cm fields.
Using a LAE survey could in principle allow one to describe 
reionization using both the shape and normalization of the 
cross power spectrum. 

A closer look at a specific planned LAE observing pro- 
gram using the Subaru Hyper-Suprime Cam reveals con- 
cerns about the strength of the 21 cm signal at the planned 
redshifts. If our estimate of the ionized fraction at z ~ 7
is too high, then using the cross power spectrum might be 
a useful exercise given that at higher redshifts and neutral 
fractions it is able to distinguish between toy models with 
two different topologies. Indeed we predict that a detection 
of a correlation signal will be made - the main issue will be 
the interpretation of that signal. 

There are a few observational effects which we have 
not included in our analysis. On the galaxy side, we have 
neglected to mimic the effect of interlopers that could be 
misidentified as high-redshift galaxies. On the 21 cm side, we 
have assumed that the projected signal can be recovered on 
the angular scale of one of our resolution elements. A greater 
source of uncertainty, however, may be the effect of foreground
removal (e.g. Jelić et al. 2008; Bernardi et al. 2009;
Harker et al. 2010; Jelić et al. 2010; Petrovic & Oh 2011).

The prospect of combining upcoming galaxy surveys 
with measurements of the 21 cm signal remains quite exciting.
Such combinations should not only be able to tell us
about the progress of reionization, but the topology and the 
main drivers as well. This may turn out to be a key exercise 
in understanding this epoch while true imaging of 21 cm 
maps remains pending. 